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The availability of quantum annealing devices with hundreds of qubits has made the experimental 
demonstration of a quantum speedup for optimization problems a coveted, albeit elusive goal. Going 
beyond earlier studies of random Ising problems, here we introduce a method to construct a set of 
frustrated Ising-model optimization problems with tunable hardness. We study the performance 
of a D-Wave Two device (DW2) with up to 503 qubits on these problems and compare it to a 
suite of classical algorithms, including a highly optimized algorithm designed to compete directly 
with the DW2. The problems are generated around predetermined ground-state configurations, 
called planted solutions, which makes them particularly suitable for benchmarking purposes. The 
problem set exhibits properties familiar from constraint satisfaction (SAT) problems, such as a peak 
in the typical hardness of the problems, determined by a tunable clause density parameter. We 
bound the hardness regime where the DW2 device either does not or might exhibit a quantum 
speedup for our problem set. While we do not find evidence for a speedup for the hardest and most 
frustrated problems in our problem set, we cannot rule out that a speedup might exist for some of 
the easier, less frustrated problems. Our empirical findings pertain to the specific D-Wave processor 
and problem set we studied and leave open the possibility that future processors might exhibit a 
quantum speedup on the same problem set. 


I. INTRODUCTION 

Interest in quantum computing is motivated by the 
potential for quantum speedup the ability of quan¬ 
tum computers, aided by uniquely quantum features such 
as entanglement and tunneling, to solve certain compu¬ 
tational problems in a manner that scales better with 
problem size than is possible classically. Despite tremen¬ 
dous progress in building small-scale quantum informa¬ 
tion processors [1] , there is as of yet no conclusive exper¬ 
imental evidence of a quantum speedup. For this reason, 
there has been much recent interest in building more spe¬ 
cialized quantum information processing devices that can 
achieve relatively large scales, such as quantum simula¬ 
tors [2] and quantum annealers [3, 4]. The latter are 
designed to solve classically hard optimization problems 
by exploiting the phenomenon of quantum tunneling [5- 
14]. Here we report on experimental results that probe 
the possibility that a putative quantum annealer may be 
capable of speeding up the solution of certain carefully 
designed optimization problems. We refer to this either 
as a limited or as a potential quantum speedup, since we 
study the possibility of an advantage relative to a portfo¬ 
lio of classical algorithms that either “correspond” to the 
quantum annealer (in the sense that they implement a 
similar algorithmic approach running on classical hard¬ 
ware), or implement a specific, specialized classical al¬ 
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gorithm [15]. In addition, for technical reasons detailed 
below we must operate the putative quantum annealer 
in a suboptimal regime. With these caveats in mind, 
we push the experimental boundary in searching for a 
quantum advantage over a class of important classical 
algorithms, which includes simulated classical and quan¬ 
tum annealing [16, 17], using quantum hardware. We 
achieve this by designing Ising model problems that ex¬ 
hibit frustration, a well-known feature of classically-hard 
optimization problems [18, 19]. In doing so we go be¬ 
yond the random spin-glass problems of earlier studies 
[15, 20], by ensuring that the problems we study exhibit 
a degree of hardness that we can tune, and have at least 
one “planted” ground state that we know in advance. 

The putative quantum annealer used in our work is the 
D-Wave Two (DW2) device [21]. This device is designed 
to solve optimization problems by evolving a known ini¬ 
tial configuration — the ground state of a transverse field 
Hx = where af is the Pauli spin-1/2 matrix 

acting on spin i — towards the ground state of a classical 
Ising-model Hamiltonian which serves as a cost function 
that encodes the problem that is to be solved: 

Hlsing = 'y ] Jij CT.i <Tj + hjCj ■ ( 1 ) 

(i,j)€.E i£V 

The variables {erf} denote either classical Ising-spin vari¬ 
ables that take values ±1 or Pauli spin-1/2 matrices, the 
{Jij} are programmable coupling parameters, and the 
{hi} are programmable local longitudinal fields. The N 
spin variables are realized as superconducting flux qubits 
and occupy the vertices V of a graph G with edge set 
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E. Here G is the D-Wave “Chimera” hardware graph 
[21, 22], Further details, including a visualization of 
the Chimera graph and the annealing schedule that in¬ 
terpolates between Hx and -ffi s i ng , are provided in Ap¬ 
pendix 

The dual questions of the computational power and 
underlying physics of the D-Wave devices — the 512- 
qubit DW2 and its predecessor, the 128-qubit DW1 - 
have generated a fair amount of interest and debate. A 
major concern is to what extent quantum effects deter¬ 
mine the performance of these devices, given that they 
are inherently noisy and operate at temperatures (~20 
mK) where thermal effects are expected to play a signifi¬ 
cant role, causing decoherence, excitation and relaxation. 
Several studies have addressed these concerns [3, 20, 23- 
32] . Entanglement has been experimentally detected dur¬ 
ing the annealing process [33], and multiqubit tunneling 
involving up to 8 qubits has been demonstrated to play 
a functional role in determining the output of a DW2 de¬ 
vice programmed to solve a simple non-convex optimiza¬ 
tion problem [34]. However, the role of quantum effects 
in determining the computational performance of the D- 
Wave devices on hard optimization problems involving 
many variables remains an open problem. A direct ap¬ 
proach to try to settle the question is to demonstrate 
a quantum speedup. Such a demonstration has so far 
been an elusive goal, possibly because the random Ising 
problems chosen in previous benchmarking tests [15, 20] 
were too easy to solve for the classical algorithms against 
which the D-Wave devices were compared; namely such 
problems exhibit a spin-glass phase only at zero temper¬ 
ature [35] . The Sherrington-Kirkpatrick model with ran¬ 
dom ±1 couplings, exhibiting a positive spin-glass critical 
temperature, was tested on a DW2 device, but the prob¬ 
lem sizes considered were too small (due to the need to 
embed a complete graph into the Chimera graph) to test 
for a speedup [31]. The approach we outline next allows 
us to directly probe for a quantum speedup using frus¬ 
trated Ising spin glass problems with a tunable degree of 
hardness, though we do not know whether these problems 
exhibit a positive-temperature spin-glass phase. 
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FIG. 1. Examples of randomly generated loops and 
couplings on the DW2 Chimera graph. Top: qubits 
and couplings participating in the loops are highlighted in 
green and purple, respectively. Only even-length loops are 
embeddable on the Chimera graph. Bottom: distribution of 
J values for a sample problem instance with N = 126 spins 
and edges, and 101 loops. It is virtually impossible to recover 
the loop-Hamiltonians Hj from a given ffising. The couplings 
are all eventually rescaled to lie in [—1,1], We always set the 
local fields hi to zero as non-zero fields tended to make the 
problems easier. 



II. FRUSTRATED ISING PROBLEMS WITH 
PLANTED SOLUTIONS 

In this section we introduce a method for generating 
families of benchmark problems that have a certain de¬ 
gree of “tunable hardness”, achieved by adjusting the 
amount of frustration, a well-known concept from spin- 
glass theory [36] . In frustrated optimization problems no 
configuration of the variables simultaneously minimizes 
all terms in the cost function, often causing classical al¬ 
gorithms to get stuck in local minima, since the global 
minimum of the problem satisfies only a fraction of the 
Ising couplings and/or local fields. We construct our 
problems around “planted solutions” - an idea borrowed 
from constraint satisfaction (SAT) problems [37, 38]. The 


planted solution represents a ground state configuration 
of Eq. (1) that minimizes the energy and is known in 
advance. This knowledge circumvents the need to verify 
the ground state energy using exact (provable) solvers, 
which rapidly become too expensive computationally as 
the number of variables grows, and which were employed 
in earlier benchmarking studies [15, 20]. 

To create Ising-type optimization problems with 
planted solutions, we make use of frustrated “local Ising 
Hamiltonians” Hj, i.e., Ising problem instances defined 
on sub-graphs of the Chimera graph in lieu of the clauses 
appearing in the SAT formulas. The total Hamiltonian 
for each problem is then of the form Hising = IZjLi Hj, 
where the sum is over the (possibly partially overlapping) 
local Hamiltonians. Similarly to SAT problems, the size 
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of these local Hamiltonians, or clauses, does not scale 
with the size of the problem. Moreover, to ensure that 
the planted solution is a ground state of the total Hamil¬ 
tonian, we construct the clauses so that each is minimized 
by that portion of the planted solution that has support 
over the corresponding subgraph. The planted solution 
is therefore determined prior to constructing the local 
Hamiltonians, by assigning random values to the bits on 
the nodes of the graph. The above process generates a 
Hamiltonian with the property that the planted solution 
is a simultaneous ground state of all the frustrated local 
Hamiltonians. 

The various clauses Hj can be generated in many dif¬ 
ferent ways. This freedom allows for the generation of 
many different types of instances, and here we present 
one method. An IV-qubit M-clause instance is generated 
as follows. 

1. A random configuration of N bits corresponding to 
the participating spins of the Chimera graph is gen¬ 
erated. This configuration constitutes the planted 
solution of the instance. 

2. M random loops are constructed along the edges 
of the Chimera graph. The loops are constructed 
by placing random walkers on random nodes of the 
Chimera graph, where each edge is determined at 
random from the set of all edges connected to the 
last edge. The random walk is terminated once 
the random walker crosses its path, i.e., when a 
node that has already been visited is encountered 
again. If this node is not the origin of the loop the 
“tail” of the path is discarded. Examples of such 
loops are given in Fig. 1. We distinguish between 
“short loops” of length i = 4, 6, and “long loops” of 
length £ > 8, as these give rise to peaks in hardness 
at different loop densities. Here we focus on long 
loops; results for short loops, which tend to gener¬ 
ate significantly harder problem instances, will be 
presented elsewhere. 

3. On each loop, a clause Hj is defined by assigning 
Jij = ±1 couplings to the edges of the loop in such 
a way that the planted solution minimizes Hj. As 
a first step, the J^’s are set to the ferromagnetic 
Jij = —SiSj, where the S; are the planted solution 
values. One of the couplings in the loop is then 


1 To see that the planted solution minimizes the total Hamilto¬ 
nian, assume that a configuration x * is a minimum of fj (x) for 
all j, where {/',(.7:)} is a set of arbitrary real-valued functions. 
Then, by definition, for each j , fj(x) > fj(x *) for all possible 
configurations x. Let us now define f(x) = JA fj(x). It follows 
then that f(x) > JA fj( x *)- Since also f(x t ) = JA fj( x *)> x * 
is a minimizing configuration of /( x). 

2 Somewhat confusingly from our perspective of utilizing frustra¬ 
tion, such Hamiltonians are sometimes called “frustration-free” 

[ 39 ]. 


chosen at random and its sign is flipped. This en¬ 
sures that no spin configuration can satisfy every 
edge in that loop, and the planted solution remains 
a global minimum of the loop, but is now a frus¬ 
trated ground state. 

4. The total Hamiltonian is then formed by adding up 
the M -loop clauses Hj. Note that loops can par¬ 
tially overlap, thereby also potentially “canceling 
out” each other’s frustration, a useful feature that 
will give rise to an easy-hard-easy pattern we dis¬ 
cuss below. Since the planted solution is a ground 
state of each of the i?j’s, it is also a ground state 
of the total Hamiltonian Hj s i ng . 

Ising-type optimization problems with planted solu¬ 
tions, such as those we have generated, have several at¬ 
tractive properties that we utilize later on: i) Having a 
ground-state configuration allows us to readily precom¬ 
pute a measure of frustration, e.g., the fraction of frus¬ 
trated couplings of the planted solution. We shall show 
that this type of measure correlates well with the hard¬ 
ness of the problem, as defined in terms of the success 
probability of finding the ground state or the scaling 
of the time-to-solution. ii) By changing the number of 
clauses M we can create different classes of problems, 
each with a “clause density” a = M/N, analogous to 
problem generation in SAT. The clause density can be 
used to tune through a SAT-type phase transition [40], 
i.e., it may be used to control the hardness of the gen¬ 
erated problems. Here too, we shall see that the clause 
density plays an important role in setting the hardness of 
the problems. Note that when the energy is unchanged 
under a spin-flip the solution is degenerate, so that our 
planted solution need not be unique. 


III. ALGORITHMS AND SCALING 

A judicious choice of classical algorithms is required to 
ensure that our test of a limited or potential quantum 
speedup is meaningful. We considered (i) simulated an¬ 
nealing (SA), a well-known, powerful and generic heuris¬ 
tic solver [16]; (ii) simulated quantum annealing (SQA) 
[8-11], a quantum Monte Carlo algorithm that has been 
shown to be consistent with the D-Wave devices [20, 32]; 


3 To see that there can be no spin configuration with an energy 
lower than that of the planted solution, consider a given loop and 
a given spin in that loop; note that every spin participates in two 
couplings. Either both couplings are satisfied after the sign flip, 
or one is satisfied and the other is not. Correspondingly, flipping 
that spin will thus either raise the energy or leave it unchanged. 
This is true for all spins in the loop. 

4 Note that for small values of M, the number of spins actually 
participating in an instance will be smaller than N, the number 
of spins on the graph from which the clauses are chosen. 




4 


(iii) the Shin-Smolin-Smith-Vazirani (SSSV) thermal ro¬ 
tor model, that was specifically designed to mimic the D- 
Wave devices in their classical limit [26]; (iv) the Hamze- 
Freitas-Selby (HFS) algorithm [41, 42], an algorithm that 
is fine-tuned for the Chimera graph and appears to be 
the most competitive at this time. Of these, the HFS 
algorithm is the only one that is designed to exploit the 
scaling of the treewidth of the Chimera graph (see Ap¬ 
pendix ) , which renders it particularly efficient in a 
comparison against the D-Wave devices [43]. 

The D-Wave devices and all the algorithms we consid¬ 
ered are probabilistic, and return the correct ground state 
with some probability of success. We thus perform many 
runs of the same duration r for a given problem instance, 
and estimate the success probability empirically as the 
number of successes divided by the number of runs. This 
is repeated for many instances at a given clause density a 
and number of variables N, and generates a distribution 
of success probabilities. Let p( A) denote the success prob¬ 
ability for a given set of parameters A = {N, a, q}, where 
q denotes the gth percentile of this distribution; e.g., half 
the instances for given N and a have a higher empirical 
success probability than the median p(N, a, 0.5). The 
number of runs required to find the ground state at least 
once with a desired probability pd is [15, 20] 


r(A) 


log(l - p d ) 
log(l-p(A)) ’ 


( 2 ) 


and henceforth we set pd = 0.99. Correspondingly, the 
time-to-solution is TTS(A) = r(A)r, where for the D- 
Wave device r signifies the annealing time t a (at least 
20/is), while for SA, SQA, and SSSV r is the number of 
Monte Carlo sweeps s (a complete update of all spins) 
multiplied by the time per sweep tx for algorithm X. 
For SA, we further distinguish between using it as a solver 
(SAS) or as an annealer (SAA): in SAS mode we keep 
track of the energies found along the annealing schedule 
(which we take to be linear in the inverse temperature 
fi) and take the lowest, while in SAA mode we always 
use the final energy found. Thus SAA can never be bet¬ 
ter than SAS, but is a more faithful model of an analog 
annealing device. A similar distinction can be made for 
SQA (i.e., SQAA and SQAS), but we primarily consider 
the annealer version since it too more closely mimics the 
operation of DW2 (note that SAA and SQAA were also 
the modes used in Refs. [15, 20]; SQAS results are shown 
in Appendix ). For the HFS algorithm, TTS(A) is calcu¬ 
lated directly from the distribution of runtimes obtained 
from 10 5 6 identical, independent executions of the algo¬ 
rithm. Further timing details are given in Appendix 
We next briefly discuss how to properly address re¬ 
source scaling [15]. The D-Wave devices use N qubits 


5 We prefer to define r in this manner, rather than rounding it as 
in [15, 20], as this simplifies the extraction of scaling coefficients. 

6 In our simulations tsa = 3.54//S, tqqa = 9.92/rs, and rsssv — 
10.34/rs. 
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FIG. 2. Time to solution as a function of clause den¬ 
sity. Shown is TTS(L, a, 0.5) (log scale) for (a) DW2 and 
(b) HFS, as a function of the clause density. The different 
colors represent the different Chimera subgraph sizes, which 
continue to L = 12 in the HFS case. In both cases there is 
a clear peak. From the HFS results we can identify the peak 
position as being at a = 0.17 ± 0.01, which is consistent with 
the peak position in the DW2 results. Error bars represent 
2 <j confidence intervals. 


and O(N) couplers to compute the solution of the input 
problem. Thus, it uses resources which scale linearly in 
the size of the problem, and we should give our classical 
solvers the same opportunity. Namely, we should allow 
for the use of linearly more cores and memory for our 
classical algorithms as we scale the problem size. For an¬ 
nealers such as SA, SQA, and SSSV, this is trivial as we 
can exploit the bipartite nature of the Chimera graph to 
perform spin updates in parallel so that each sweep takes 
constant time and a linear number of cores and memory. 
The HFS algorithm is not perfectly parallelizable but as 
we explain in more detail in Appendix V, we take this 
into account as well. Finally, note that dynamic pro¬ 
gramming can always find the true ground state in a time 
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FIG. 3. Frustration fraction. Shown is the fraction of frus¬ 
trated couplings (the number of frustrated couplings divided 
by the total number of couplings, where a frustrated coupling 
is defined with respect to the planted solution) as a function 
of clause density for different Chimera subgraphs Cl, in the 
case of loops of length > 8, averaged over the 100 instances 
for each given a and N. There is a broad peak at a ~ 0.25. 
This is the clause density at which there is the largest frac¬ 
tion of frustrated couplings, and is near where we expect the 
hardest instances to occur, in good agreement with Fig. 2. 


that is exponential in the Chimera graph treewidth, i.e., 
that scales as exp(cv^V) [22]. The natural size scale in 
our study is the square Chimera subgraph Cl comprising 
L 2 unit cells of 8 qubits each, i.e., N = 8 L 2 . Therefore, 
henceforth we replace N by L so that A = {L, a , q}. 


IV. PROBING FOR A QUANTUM SPEEDUP 

We now come to our main goal, which is to probe 
for a limited or potential quantum speedup on our frus¬ 
trated Ising problem set. We reserve the term “limited 
speedup” for our comparisons with SA, SQA, and SSSV, 
while the term “potential speedup” refers to the com¬ 
parison with the HFS algorithm, which unlike the other 
three algorithms, does not implement a similar algorith¬ 
mic approach to a quantum annealer. 


A. Dependence on clause density 

We first analyze the effect of the clause density. This 
is shown in Fig. 2 , where we plot TTS(L, a, 0.5) for the 
DW2 and the HFS algorithm. We note that the worst- 
case TTS of ~ 10ms is smaller than that observed for 


7 In this study we focus mostly on the median since with 100 in¬ 
stances per setting, higher quantiles tend be noise-dominated. 
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FIG. 4. Sketch of the relation between the log(TTS) 
curves for optimal and suboptimal annealing times. 

The annealing time needs to be optimized for each prob¬ 
lem size. Blue represents the TTS with a size-independent 
annealing time t a . Red represents the optimal TTS corre¬ 
sponding to having an optimal size-dependent annealing time 
ta Pt {L), i.e. the lower envelope of the full series of fixed an¬ 
nealing time TTS curves. This curve need not be linear as de¬ 
picted, though we expect it to be linear for NP-hard problems. 
The blue line upper bounds the red line since by definition 
TTSdw2(A, t° pt (L)) < TTSdw2(A, t a )- The vertical dotted 
line represents the problem size L* at which t a = ta pt (L*). To 
the left of this line t a > t° pt and the slope of the fixed-t a TTS 
curve lower-bounds the slope of the optimal TTS curve, since 
for very small problem sizes a large t a results in insensitivity 
to problem size, and the success probability is essentially con¬ 
stant. The opposite happens to the right of this line, where 
t a < ta pt , and where the success probability rapidly drops 
with L at fixed t a . 


random Ising problems in Ref. [15] [~ 100ms for range 7, 
Cs, and q = 0.5 (median)]. However, as we shall demon¬ 
strate, the classical algorithms against which the DW2 
was benchmarked in Ref. [15] (SA and SQA) scale sig¬ 
nificantly less favorably in the present case, i.e., whereas 
in Ref. [15] no possibility of a limited speedup against 
SA was observed for the median, here we will find that 
such a possibility remains. In this sense, the problem 
instances considered here are relatively harder for the 
classical solvers than those of Ref. [15]. 

For our choice of random loop characteristics, the time- 
to-solution peaks at a clause density a « 0.17, reflecting 
the hardness of the problems in that regime. To cor¬ 
relate the hardness of the instances with their degree 
of frustration, we plot the frustration fraction, defined 
as the ratio of the number of unsatisfied edges with re¬ 
spect to the planted solution to the total number of edges 
on the graph, as a function of clause density. The frus¬ 
tration fraction curve, shown in Fig. 3 , has a peak at 
a ss 0.25, confirming that frustration and hardness are 
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FIG. 5. Median time-to-solution over instances. Plotted is TTS(L, q, 0.5) (log scale) as a function of the Chimera 
subgraph size Cl, for a range of clause densities and for all solvers we tested. Note that only the scaling matters and not 
the actual TTS, since it is determined by constant factors that vary from processor to processor, compiler options, etc. All 
algorithms’ timing reflects the result after accounting for parallelism, as described in Appendix . Error bars represent 2 a 
confidence intervals. The DW2 annealing time is t a = 20/xs. 
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FIG. 6. Speedup ratio. Plotted is the median speedup ratio Sx(L,a, 0.5) (log scale) as defined in Eq. ( ) for all algorithms 
tested. A negative slope indicates a definite slowdown for the DW2. A positive slope indicates the possibility of an advantage 
for the DW2 over the corresponding classical algorithm. This is observed for a > 0.4 in the comparison to SAS, SQA, SSSV, 
and HFS (see Fig. for a more detailed analysis). Error bars represent 2<r confidence intervals. 


indeed correlated. The hardness peak is reminiscent of 
the analogous situation in SAT, where the clause den¬ 
sity can be tuned through a phase transition between 
satisfiable and unsatisfiable phases [40]. The peak we 
observe may be interpreted as a finite-size precursor of 
a phase transition. This interpretation is corroborated 
below by time-to-solution results of all other tested al¬ 
gorithms which will also find problems near the critical 


Note that in our definition of frustration fraction, frustration 
is measured with respect to all edges of the graph from which 
clauses are chosen, similarly to the way clause density is defined 
in SAT problems. 


point the hardest. Indeed, all the algorithms we stud¬ 
ied exhibit qualitatively similar behavior to that seen in 
Fig. 2 (see Fig. in Appendix B), with an easy-hard- 
easy pattern separated at q ~ 0.2. This is in agreement 
with previous studies, e.g., for MAX 2-SAT problems on 
the DW1 [44], and for fc-SAT with k > 2, where a similar 
pattern is found for backtracking solvers [45]. It is impor¬ 
tant to note that we do not claim that this easy-hard-easy 
transition coincides with a spin-glass phase transition; 
we have not studied which phases actually appear in our 
problem set as we tune the clause density. 

A qualitative explanation for the easy-hard-easy pat¬ 
tern is that when the number of loops (and hence a) is 
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DW success probability, f a =20ps 

(a) 



(b) 

FIG. 7. Success probability correlations. The results 
for all instances at a = 0.35 are shown. Each datapoint is 
the success probability for the same instance, (a) for DW2 at 
t a = 20/xs and t a = 40/rs, (b) for DW2 at t a = 20fj,s and SAA 
at 50,000 sweeps and Pf = 5 [in dimensionless units, such that 
max(Jij) = 1]. Perfect correlation means that all data points 
would fall on the diagonal, and a strong correlation is observed 
in both cases. The data is colored by the problem size L and 
shows a clear progression from high success probabilities at 
small L to large success probabilities at small L. Qualitatively 
similar results are seen for t a = 20/iS vs t a = 40/rs at all a 
values, and for DW2 vs SAA at intermediate a values (see 
Figs. 22 and 23 in Appendix 1). 


small they do not overlap and thus each loop becomes an 
easy optimization problem. In the opposite limit many 
loops pass through each edge, thus tending to reduce 
frustration, since each loop contributes either a “frus¬ 
trated” edge with small probability l/£ (where l is the 
loop length) or an “unfrustrated” edge with probability 
1 — 1/i. The hard problems thus lie in between these two 
limits, where a constant fraction (bounded away from 0 
and 1) of loops overlap. 


B. General considerations concerning scaling and 
speedup 


Of central interest is the question of whether there is 
any scaling advantage (with L ) in using a quantum device 
to solve our problems. Therefore, we define a quantum 
speedup ratio for the DW2 relative to a given algorithm 
X as [15] 

S x (X,t a ) = T ™ X< tx\ ) ■ (3) 

i iODW2(A CaJ 

Since this quantity is specific to the DW2 we refer to 
it simply as the empirical “DW2 speedup ratio” from 
now on, though of course it generalizes straightforwardly 
for any other putative quantum annealer or processor 
against which algorithm X is compared. 

We must be careful in using Sbf(A,f a ) in assessing a 
speedup, since the annealing time must be optimized 
for each problem size L in order to avoid the pit- 
fall of a fake speedup [15]. Let us denote the (un¬ 
known) optimal annealing time by t° pt (L). By definition, 
TTSdw 2 (A, t opt (L)) < TTSdw 2 (A, t a ), where t a is a fixed 
annealing time. We now define the optimized speedup ra¬ 
tio as 


5^ pt (A,C t (^)) 


TTS a -(A) 

TTSd W2 (A ,t° pt (L)) 


( 4 ) 


and clearly Sx( A, t a ) < S^ t (X, f° pt (L)), i.e., the speedup 
ratios computed using Eq. (3) are lower bounds on the op¬ 
timized speedup. However, what matters for a speedup 
is the scaling of the speedup ratio with problem size. 
Thus, we are interested not in the numerical value of the 
speedup ratio but rather in the slope dS/dL (recognizing 
that this is a formal derivative since L is a discrete vari¬ 
able). A positive slope would indicate a DW2 speedup, 
while a negative speedup slope would indicate a slow¬ 
down. 

We thus define the DW2 speedup regime as the set 
of problem sizes C + where j^Sx(X,t° pt (L)) > 0 for all 
L £ C + . Likewise, the DW2 slowdown regime is the set 
of problem sizes C~ where j^Sx{X, t° pt (L)) < 0 for all 
i.eC. 

From a computational complexity perspective one is 
ultimately interested in the asymptotic performance, i.e., 
the regime where L becomes arbitrarily large. In this 
sense a true speedup would correspond to the observa¬ 
tion that C+ = [!&„,£+«], with L mi n a positive con¬ 
stant and L+ax —► oo. Of course, such a definition is 
meaningless for a physical device such as the DW2, for 
which L+ ax is necessarily finite. Thus the best we can 
hope for is an observation that C + is as large as is consis¬ 
tent with the device itself, which in our case would imply 
that £ + = [1,8]. However, as we shall argue, we can in 
fact only rule out a speedup, while we are unable to con¬ 
firm one. I.e., we are able to identify C~ = [L“ in , 
but not £ + . 

The culprit, as in earlier benchmarking work [15] and 
as we establish below, is the fact that the DW2 minimum 





annealing time of t a = 20/zs is too long (see also Ap¬ 
pendix I). This means that the smaller the problem size 
the longer it takes to solve the corresponding instances 
compared to the case with an optimized annealing time, 
and hence the observed slope of the DW2 speedup ratio 
should be interpreted as a lower bound for the optimal 
scaling. This is illustrated in Fig. . Without the abil¬ 
ity to identify t° pt (L) we do not know of a way to infer, 
or even estimate C + . However, as we now demonstrate, 
under a certain reasonable assumption we can still bound 

Cr. 

The assumption is that if t a > t° pt then 
j^TTSdw 2 (A, t a ) < ^TTS DW2 (A ,t° pt (L)) for all L < 
L*, the problem size for which t a = t° pt (L*). This 
assumption is essentially a statement that the TTS is 
monotonic in L , as illustrated in Fig. . Next we con¬ 
sider the (formal) derivatives of Eqs. (3) and ( 1): 

&S x (\,t a ) _ a L TTS x (A) d L TTS DW2 (A,t a ) 

Sx(X,t a ) TTS X (A) TTS DW2 (A,t a ) 1 

iE^(A,f° pt (^)) _ ^lTTSx(A) 

S x (\,t° p \L)) TTSx(A) 

_ ^TTS DW2 (A, t° pt (L)) 

TTS DW2 (A,ta pt (T)) 

Collecting these results we have 

( 6 ) 

Therefore, if we find that ^S'x(A, t a ) < 0 in the sub- 
optimal regime where t a > t° pt (L), then it follows that 
j^Sx(^,t° pt (L)) < 0. In other words, a DW2 speedup 
is ruled out if we observe a slowdown using a suboptimal 
annealing time. 

C. Scaling and speedup ratio results 

In Fig. we show the scaling of the median time-to- 
solution for all algorithms studied, for a representative 
set of clause densities. All curves appear to match the 
general dynamic programming scaling for L > 4, i.e., 
TTS(A) ~ exp[6(a)L], but the scaling coefficient 6(a) 
clearly varies from solver to solver. This scaling is simi¬ 
lar to that observed in previous benchmarking studies of 
random Ising instances [15, 20]. 

In Fig. 6 we show the median scaling of Sx for the 
same set of clause densities as shown in Fig. . We ob¬ 
serve that in all cases there is a strong dependence on 
the clause density a, with a negative slope of the DW2 


9 Individual instances may not satisfy this assumption [46, 47], but 
we are not aware of any cases where averaging over an ensemble 
of instances violates this assumption. 
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FIG. 8. Correlation between the DW2 data for two dif¬ 
ferent annealing times and SAA. Plotted is the normal¬ 
ized Euclidean distance D(p\ . p?) for pi and p *2 being, respec¬ 
tively, the ordered success probability for DW2 at t a = 20/rs 
and t a = 40/rs (blue circles), DW2 at t a = 20ps and SAA 
(yellow squares), DW2 at t a = 40ps and SAA (green di¬ 
amonds). For comparison, the Euclidean distance between 
two random vectors with elements £ [0,1] is ~ 0.4. SAA 
data is for 50,000 sweeps and /3f = 5. The correlation with 
SAA degrades slightly for t a = 40ps. Error bars represent 20- 
confidence intervals and were computed using bootstrapping 
(see Appendix for details). In each comparison, to con¬ 
struct pi and p 2 we fixed a and used half the instances (for 
bootstrapping purposes) for L £ [2,8]. 


speedup ratio for the lower clause densities, correspond¬ 
ing to the harder, more frustrated problems. In this 
regime the DW2 exhibits a scaling that is worse than the 
classical algorithms and by Eq. ( ) there is no speedup. 
The possibility of a DW2 speedup remains open for the 
higher clause densities, where a positive slope is observed, 
i.e., the DW2 appears to find the easier, less frustrated 
problems easier than the classical solvers. This apparent 
advantage is most pronounced for a > 0.4, where we ob¬ 
serve the possibility of a potential speedup even against 
the highly fine-tuned HFS algorithm (this is seen more 
clearly in Fig. 0). Moreover, the DW2 speedup ratio 
against HFS improves slightly at the higher percentiles 
(see Fig. in Appendix 3), which is encouraging from 
the perspective of a potential quantum speedup. 


D. Scaling coefficient results 

To test the dependence on t a , we repeated our DW2 
experiments for t a € [20,40]/xs, in intervals of 2/xs. Fig¬ 
ure is a success probability correlation plot between 
t a = 20/xs and t a = 40/xs, at a = 0.35. The correla¬ 
tion appears strong, suggesting that the device might 
already have approached the asymptotic regime where 
increasing t a does not modify the success probabilities. 
To check this more carefully let us first define the normal- 
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FIG. 9. Scaling coefficients of the number of runs. Plot¬ 
ted here is b(a) [Eq. (8)] for the DW2 at all annealing times 
(overlapping solid lines and large symbols), for the HFS al¬ 
gorithm, for SAS with an optimal number of sweeps, for SAS 
with noise and an optimal number of sweeps, and for SAA 
with a large enough number of sweeps that the asymptotic 
distribution has been reached at /?/ = 5. The scaling co¬ 
efficients of HFS and of optimized SAS each set an upper 
bound for a DW2 speedup against that particular algorithm. 
In terms of the scaling coefficient the DW2 result is statisti¬ 
cally indistinguishable (except at a = 0.1) from SAA run at 
S = 50,000 and /3/ = 5. The coefficients shown here are ex¬ 
tracted from fits with L > 4 (see Fig. in Appendix 5). 

Error bars represent 2a confidence intervals. 


FIG. 10. Difference between the scaling coefficients. 

Plotted here is the difference between the scaling coefficients 
in Fig 9, 6 x(o) — 6 dw 2 («), where X denotes the HFS algo¬ 
rithm, SAS with an optimal number of sweeps, SAS with noise 
and an optimal number of sweeps, or SAA with S = 50, 000 
and = 5. When the difference is non-positive there can 
be no speedup since optimizing t a can only increase 6 dw 2 (o); 
conversely, when the difference is positive a speedup is still 
possible, i.e., not accounting for the error bars, for a < 0.05 
and a > 0.4 for HFS, and for a < 0.15 and a > 0.35 for 
SAS without noise. These ranges shrink if the error bars 
are accounted for, but notably, for most a values SAS with 
noise does not disallow a limited speedup, suggesting that con¬ 
trol noise may play an important factor in masking a DW2 
speedup. Error bars represent 2a confidence intervals. 


ized Euclidean distance between two length -M vectors of 
probabilities p-\ and p 2 as 

D(pi,p 2 ) ■= ~^\\pi -P 2 W (7) 

(where ||p|| = y/p-p)] clearly, 0 < D(pi,p 2 ) < 1. The re¬ 
sult for the DW2 data with pi and p 2 being the ordered 
sets of success probabilities for all instances with given 
a at t a = 20/rs and t a = 40/iS respectively, is shown as 
the blue circles in Fig. . The small distance for all a 
suggests that for t a > 20/xs the distribution of ground 
state probabilities has indeed nearly reached its asymp¬ 
totic value. 

This result means that the number of runs r(A) 
[Eq. (2)] does not depend strongly on t a either. To 
demonstrate this we first fit the number of runs to 

r{L,a, 0.5) = e Q(a)+h(a)i , (8) 

in accordance with the abovementioned expectation of 
the scaling with the treewidth, and find a good fit across 
the entire range of clause densities (see Fig. in Ap¬ 
pendix 3). The corresponding scaling coefficient b(a) is 
plotted in Fig. for all annealing times (the constant 
a(a) is shown in Fig. in Appendix and is well- 

behaved); the data collapses nicely, showing that the scal¬ 
ing coefficient b(a) has already nearly reached its asymp¬ 
totic value. Also plotted in Fig. is the scaling coefficient 


b(a) for HFS and for SAS with an optimized numbers of 
sweeps at each a and L, as extracted from the data shown 
in Fig. . 

By Eq. (6), where &dw 2 (ck) > bx(ct) there is no DW2 
speedup (£~ = [4,8]), whereas where frow2(a0 < &x(cO, 
a DW2 speedup over algorithm X is still possible. 

We thus plot the difference in the scaling coefficients in 
Fig. 10. Figures 9 and 10 do allow for the possibility of a 
speedup against both HFS and SAS, at sufficiently high 
clause densities. However, we stress once more that the 
smaller DW2 scaling coefficient may be a consequence 
of t a = 20ps being excessively long, and that we cannot 
rule out that the observed regime of a possible speedup 
would have disappeared had we been able to optimize 
t a by exploring annealing times shorter than 20/iS. Note 
that an optimization of the SAS annealing schedule might 
further improve its scaling, but the same cannot be said 
of the HFS algorithm, and it seems unlikely that it could 
be outperformed even by the fully optimized version of 
SAS. 


E. DW2 vs SAA 

Earlier work has ruled out SAA as a model of the D- 
Wave devices for random Ising model problems [20], as 
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well as for certain Hamiltonians with suppressed and en¬ 
hanced ground states for which quantum annealing and 
SAA give opposite predictions [23, 27]. The observation 
made above, that the DW2 success probabilities have 
nearly reached their asymptotic values, suggests that for 
the problems studied here the DW2 device may perhaps 
be described as a thermal annealer. While we cannot con¬ 
clude on the basis of ground state probabilities alone that 
the DW2 state distribution has reached the Gibbs state, 
we can compare the ground state distribution to that of 
SAA in the regime of an asymptotic number of sweeps, 
and attempt to identify an effective final temperature for 
the classical annealer that allows it to closely describe 
the DW2 distribution. We empirically determine /3f = 5 
to be the final temperature for our SAA simulations that 
gives the closest match, and S = 50,000 (corresponding 
to 150ms, much larger than the DW2’s t a = 20/zs) to be 
large enough for the SAA distribution to have become 
stationary (see Fig. 27(a in Appendix B for results with 
additional sweep numbers confirming this). The result 
for the Euclidean distance measure is shown in Fig. 8 
for the two extremal annealing times, and the distance 
is indeed small. To more closely assess the quality of the 
correlation we select a = 0.35, the value that minimizes 
the Euclidean distance as seen in Fig. 8, and present the 
correlation plot in Fig. . Considering the results for 
each size L separately, it is apparent that the correla¬ 
tion is good but also systematically skewed, i.e., for each 
problem size the data points approximately lie on a line 
that is not the diagonal line. Additional correlation plots 
are shown in Fig. of Appendix 

As a final comparison we also extract the scaling coeffi¬ 
cients b(a) and compare DW2 to SAA in Fig. 9. It can be 
seen that in terms of the scaling coefficients the DW2 and 
SAA results are essentially statistically indistinguishable. 
However, we stress that since the SAA number of sweeps 
is not optimized, one should refrain from concluding that 
the equal scaling observed for the DW2 and SAA rules 
out a DW2 speedup. 


V. DISCUSSION 

In this work we proposed and implemented a method 
for generating problems with a range of hardness, tuned 
by the clause density, mimicking the phase structure ob¬ 
served for SAT problems. By comparing the DW2 device 
to a number of classical algorithms we delineated where 
there is no DW2 speedup and where it might still be 
possible, for this problem set. No advantage is observed 
for the low clause densities corresponding to the hardest 
optimization problems, but a speedup remains possible 
for the higher clause densities. In this sense these re¬ 
sults are more encouraging than for the random Ising 
problems studied in Ref. [15], where only the lowest per¬ 
centiles of the success probability distribution allowed for 
the possibility of a DW2 speedup. In our case there is in 
fact a slight improvement for the higher percentiles (see 



FIG. 11. Scaling of SAA at different final tempera¬ 
tures. SAA is run at S = 50,000 and various final inverse 
temperatures. The peak at a ~ 0.2 remains a robust feature. 
The data marked “Noisy” is with 5% random noise added to 
the couplings Jij. 


Fig. in Appendix ). In the same sense our findings 
are also more encouraging than very recent theoretical re¬ 
sults showing that quantum annealing does not provide 
a speedup relative to simulated annealing on 2SAT prob¬ 
lems with a unique ground state and a highly degenerate 
first excited state [48]. 

The close match between the DW2 and SAA scaling 
coefficients seen in Fig. suggests that SAA in the regime 
of an asymptotic number of sweeps can serve as a model 
for the expected performance of the D-Wave device as 
its temperature is lowered. Thus we plot in Fig. 
the scaling coefficient for SAA at various final inverse 
temperatures, at a fixed (and still asymptotic) value of 
S = 50,000. Performance improves steadily as (3/ in¬ 
creases, suggesting that SAA does not become trapped 
at 50,000 sweeps for the largest problem sizes we have 
studied (this may also indicate that even at the hardest 
clause densities these problems do not exhibit a positive- 
temperature spin-glass phase). We may infer that a sim¬ 
ilar behavior can be expected of the D-Wave device if its 
temperature were lowered. 

An additional interesting feature of the fact that the 
asymptotic DW2 ground state probability observed here 
is in good agreement with that of a thermal annealer, is 
that it gives the ground state with a similar probabil¬ 
ity as expected from a Gibbs distribution. This result 
is consistent with the weak-coupling limit that underlies 
the derivation of the adiabatic Markovian master equa¬ 
tion [49], i.e., it is consistent with the notion that the 
system-bath coupling is weak and decoherence occurs in 
the energy eigenbasis [50]. This rules out the possibil¬ 
ity that decoherence occurs in the computational basis, 
as this would have led to a singular-coupling limit mas¬ 
ter equation with a ground state probability drawn not 
from the Gibbs distribution but rather from a uniform 
distribution, if the single-qubit decoherence time is much 
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shorter than the total annealing time [49]. This is im¬ 
portant, as the weak-coupling limit is compatible with 
decoherence between eigenstates with different energies 
while maintaining ground state coherence, a necessary 
condition for a speedup via quantum annealing. In con¬ 
trast, in the singular-coupling limit no quantum effects 
survive and no quantum speedup of any sort is possible. 

We note that an important disadvantage the DW2 de¬ 
vice has over all the classical algorithms we have com¬ 
pared it with, is control errors in the programming of 
the local fields and couplings [51-54]. As shown in Ap¬ 
pendix (Fig. 8), many of the rescaled couplings Jy 
used in our instances are below the single-coupler con¬ 
trol noise specification, meaning that with some proba¬ 
bility the DW2 is giving the right solution to the wrong 
problem. We are unable to directly measure the effect of 
such errors on the DW2 device, but their effect is demon¬ 
strated in Figs. 10 and LI, where both SAS and SAA 
with /3f = 5,20, respectively, are seen to have substan¬ 
tially increased scaling coefficients after the addition of 
noise that is comparable to the control noise in the DW2 
device. In fact, the DW2 scaling coefficient is smaller 
than the scaling coefficient of optimized SAS with noise 
over almost the entire range of a, suggesting that a re¬ 
duction in such errors will extend the upper bound for a 
DW2 speedup against SAS over a wider range of clause 
densities. The effect of such control errors can be miti¬ 
gated by improved engineering, but also emphasizes the 
need for the implementation of error correction on pu¬ 
tative quantum annealing devices. The beneficial effect 
of such error correction has already been demonstrated 
experimentally [25, 29] and theoretically [55], albeit at 
the cost of a reduction in the effective number of qubits 
and reduced problem sizes. 

To summarize, we believe that at least three major im¬ 
provements will be needed before it becomes possible to 
demonstrate a conclusive (limited or potential) quantum 
speedup using putative quantum annealing devices: (1) 
harder optimization problems must be designed that will 
allow the annealing time to be optimized, (2) decoher¬ 
ence and control noise must be further reduced, and (3) 
error correction techniques must be incorporated. An¬ 
other outstanding challenge is to theoretically design op¬ 
timization problems that can be unequivocally shown to 
benefit from quantum annealing dynamics. 

Finally, the methods introduced here for creating frus¬ 
trated problems with tunable hardness should serve as a 
general tool for the creation of suitable benchmarks for 
quantum annealers. Our study directly illustrates the 
important role that frustration plays in the optimization 
of spin-glass problems for classical algorithms as well as 
for putative quantum optimizers. It is plausible that dif¬ 
ferent, perhaps more finely-tuned choices of clauses to 
create novel types of benchmarks, may be used to es¬ 
tablish a clearer separation between the performance of 
quantum and classical devices. These may eventually 
lead to the demonstration of the coveted experimental 
annealing-based quantum speedup. 


Note added. Work on problem instances similar to 
the ones we studied here, but with a range of couplings 
above the single-coupler control noise specification, ap¬ 
peared shortly after our preprint [56]. The DW2 scaling 
results were much improved, supporting our conclusion 
that such errors have a strong detrimental effect on the 
performance of the DW2. 
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Appendix A: Methods 

In this section we describe various technical details re¬ 
garding out methods of data collection and analysis. 


1. Experimental details 

The DW2 is marketed by D-Wave Systems Inc. as 
a quantum annealer, which evolves a physical system 
of superconducting flux qubits according to the time- 
dependent Hamiltonian 

H(t) = A(t) ^2 of + B(t)Hi sing , t £ [0, t a ] , (Al) 

i&V 

with F/ising given in Eq. (l). The annealing schedules 
given by A(t) and B(t) are shown in Fig. 12. Our ex¬ 
periments used the DW2 device housed at the USC In¬ 
formation Sciences Institute, with an operating temper¬ 
ature of 17mK. The Chimera graph of the DW2 used 
in our work is shown in Figure 13. Each unit cell is 
a balanced K 44 bipartite graph. In the ideal Chimera 
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FIG. 12. Annealing schedule of the DW2. The annealing 
curves A(t) and B(t) are calculated using rf-SQUID models 
with independently calibrated qubit parameters. Units of h = 
1. The operating temperature of 17mK is also shown. 


graph (of 512 qubits) the degree of each vertex is 6 (ex¬ 
cept for the corner unit cells). In the actual DW2 de¬ 
vice we used 503 qubits were functional. For the scaling 
analysis we considered L x L square sub-lattices of the 
Chimera graph, and restricted our simulations and tests 
on the DW2 to the subset of functional qubits within 
these subgraphs, denoted Cl (see Figure 13). For each 
clause density a and problem size N (or Cl) we generated 
100 instances, for a total of 12,600 instances. We per¬ 
formed approximately 990000/is/t a annealing runs (ex¬ 
periments) for each problem instance and for each an¬ 
nealing time t a £ [20,40]/zs, in steps of 2/xs, for a total 
of more than 10 10 experiments. No gauge averaging [20] 
was performed because we are not concerned with the 
timing data for a single instance but a collection of in¬ 
stances, and the variation over instances is larger than 
the variation over gauges. 


2. Algorithms 

a. HFS algorithm 

The HFS algorithm is due to Hamze & Freitas [41] and 
Selby [43]. This is a tree-based optimization algorithm, 
which exploits the sparsity and local connections of the 
Chimera graph to construct very wide induced trees and 
repeatedly optimizes over such trees until no more im¬ 
provement is likely. 

We briefly discuss the tree construction of the HFS al¬ 
gorithm. It considers each part of the bipartite unit cell 
as a single 2 4 -dimensional vertex instead of four distinct 
2-dimensional vertices, resulting in a graph as depicted 
in Fig. . The tree represented by the dark vertices cov¬ 
ers 78% of the graph, and such trees will cover 75% of 



FIG. 13. The DW2 Chimera graph. The qubits or spin 
variables occupy the vertices (circles) and the couplings Jij 
are along the edges. Of the 512 qubits, 503 were operative 
in our experiments (green circles) and 9 were not (red cir¬ 
cles). We utilized subgraphs comprising L x L unit cells, 
denoted Cl, indicated by the solid black lines. There were 
31, 70,126,198, 284, 385, 503 qubits in our C 2 , ■ ■ ., Cg graphs, 
respectively. 


the graph in the limit of infinitely large Chimera graphs. 
Finding the minimum energy configuration of such a tree 
conditioned on the state of the rest of the graph can be 
done in O(N) time where N is the number of vertices. 
Since the tree encodes so much of the graph, optimizing 
over such trees can quickly find a low-lying energy state. 
This is the strategy behind Prog-QUBO [58]. Since each 
sample takes a variable number of trees, we estimate time 
to solution as as the average number of trees multiplied 
by the number of operations per tree, with a time con¬ 
stant of 0.5 fxs per operation. 

Prog-QUBO runs in a serial manner on a single core of 
a CPU. Since we allow the DW2 to use O(N) resources, 
we should also parallelize the HFS algorithm. In prin¬ 
ciple, HFS proceeds in a series of steps — at each step, 
one shears off each of the leaves of the tree (i.e., the out¬ 
ermost vertices) and then proceeds to the next step until 
the tree has collapsed to a point. Each of the leaves can 
be eliminated separately, however each elimination along 
a particular branch is dependent on the previous elim¬ 
inations. As a result, we must perform between L + 2 
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b. Simulated Annealing 



FIG. 14. An example of the view of the Chimera graph in 
the HFS algorithm. Each vertex is one half of a unit cell, 
and is 2 4 -dimensional. The algorithm repeatedly finds the 
minimum energy configuration of the graph over trees (like 
the one highlighted) given the state of the remaining vertices 
(in gray). 


Our SA algorithm uses a single spin-flip Metropolis 
update method. In a single sweep, each spin is updated 
once according to the Metropolis rule: the spin is flipped, 
the change in energy A E is calculated. If the energy is 
lowered, the flip is accepted, and if not, it is accepted 
with a probability given by the Metropolis probability: 

-FWt = min (1, exp(—/3A.E)) (A2) 

A linear annealing schedule in /? is used; Starting at Pi = 
0.01, we increment p in steps of 5(3 = ((3 / — Pi)/(S — 1) 
where S is the number of sweeps, up to (3f. Each instance 
is repeated 10 4 times. 

c. SSSV 

The SSSV model was first proposed [26] as a classical 
model that reproduced the success probabilities of the 
DW1 device studied in Ref. [20], although there is 
growing evidence that this model fails to capture the 
behavior of the device for specific instances [27, 32, 34], 
The model can be understood as describing coherent 
single qubits interacting incoherently by replacing qubits 
by 0(2) rotors; the Hamiltonian can be generated by 
replacing erf i-T sin0j and erf i-T cos The system is 
then “evolved” by Monte Carlo updates on the angles 
6 i S [0,7r]. Although SSSV is not designed to be a fast 
solver, we studied it here as a potential classical limit of 
the DW2 and checked what the scaling of such a classical 
limit would be. The DW2 annealing schedule in Fig. 
was used, and the temperature was kept constant at 
10.56mK. Each instance was repeated 10 4 times. 


and §L + 2 (average |i + 2) irreducibly serial operations 
when reducing the trees on an L x L unit cell Chimera 
graph (depending on which of the trees we use and the 
specifics for how we do the reduction). Since we deal only 
with square graphs and are concerned with asymptotic 
scaling, we ignore the constant steps and use | L basic 
operations per tree (see Sec. A3b for additional details). 

We ran Prog-QUBO in mode “-S3” [58], meaning that 
we used maximal induced trees (treewidth 1 in our case). 
Other options enable reduction over lines or treewidth 
2 subgraphs, but we did not use those options here. In 
principle, one could define a new solver by optimizing 
the treewidth of subgraphs over which to reduce for each 
problem size, which will likely perform better than only 
using treewidth 1 for arbitrary problem sizes. However, 
this was not investigated in this work. 


d. Simulated Quantum Annealing 

SQA [10, 11] is an annealing algorithm based on 
discrete-time path-integral quantum Monte Carlo sim¬ 
ulations of the transverse field Ising model but using 
Monte Carlo dynamics instead of the open system evo¬ 
lution of a quantum system. This amounts to sampling 
the world line configurations of the quantum Hamiltonian 
( i) while slowly changing the couplings. SQA has been 
shown to be consistent with the input/output behavior 
of the DW1 for random instances [20], and we accord¬ 
ingly used a discrete-time quantum annealing algorithm. 
We always used 64 Trotter slices and the code was run at 
p = 10 (in dimensionless units, such that max(| Jy |) = 1) 
with linearly decreasing and increasing A(t) and B(t), 
respectively. Cluster updates were performed only along 
the imaginary time direction. A single sweep amounts 
to the following: for each space-like slice, a random spin 
along the imaginary time direction is picked. The neigh¬ 
bors of this spin are added to the cluster (assuming they 
are parallel) according to the Wolff algorithm [59] with 
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probability 1 — e _2Jj -, where J± = —0.5 In [tanh A(t)] 
is the spin-spin coupling along the imaginary time di¬ 
rection. When the cluster construction terminates, the 
cluster is flipped according to the Metropolis probability 
using the change in energy along the space-like direction 
associated with flipping the cluster. Therefore a single 
sweep involves a single cluster update for each space-like 
slice. For every problem instance, we performed 10 3 rep¬ 
etitions in order to estimate the instance success rates. 


e. Constraint Solver Algorithm 

Here we outline the algorithm we developed and used 
to search for, and list, solutions to a specified planted- 
solution Hamiltonian, taking advantage of the fact that 
the total cost function is a sum of easily-solvable cost 
functions each defined on a finite number of bits. Since 
in our case each term in the Hamiltonian is a frustrated 
loop, it is easy to list the constraints each loop induces on 
any potential solution of the total Hamiltonian. The al¬ 
gorithm is an exhaustive search and is an implementation 
of the Bucket Elimination Algorithm (see, e.g., Ref. [60]). 

The problem structure consists of a set of values (or 
bits) where each value may be +1 or —1. A set of con¬ 
straints restricts subsets of the bits to certain values. The 
problem is to assign values to all the bits so as to satisfy 
all the constraints. 

Each constraint (in this case, the optimizing configu¬ 
rations of loop Hamiltonians) applies to a subset of the 
bits, and contains a list of allowed settings for that subset 
of bits. The constraint is met if the values of the bits in 
the subset matches one of the allowed settings. 

The Bucket Elimination Algorithm as applied to this 
problem consists of the following steps. First is the con¬ 
straint elimination stage: (i) Select a bit to eliminate; (ii) 
Save constraints which contain selected bit; (iii) Combine 
all constraints which contain selected bit; (iv) Generate 
new constraints without selected bit; (v) If combined con¬ 
straints contain a contradiction, exit; (vi) Repeat until all 
bits have been addressed. 

The second part of the algorithm tackles the enumera¬ 
tion of the solutions. It involves using the tables created 
in the constraint elimination stage to grow solution sets 
one bit at a time. The last table created in the constraint 
satisfaction stage contains tables for a single bit (the last 
to be eliminated). If it has no entries, no solutions are 
possible for the processed constraint set. If there is a sin¬ 
gle entry, it indicates the value that the bit must be — 1, 
or +1. If there are two entries, then both —1 and +1 are 
possible values. A solution set is built listing the values 
of the final bit that was eliminated. The table created 
for the final two bits is processed next. It contains all the 
legal values for the final two bits. For each allowed value 
of the final bit, a solution is generated for the final two 
bits, resulting in a new solution set for the final two bits. 
The bit solution tables generated during the elimination 
phase are processed in reverse order, each time adding a 


single bit, combining the solutions from the previous step 
with the single bit, generating a new solution set with a 
bit added. Ultimately all of the bit solution tables gen¬ 
erated during the elimination phase are processed and 
solutions with values for all bits are generated. The logic 
driving the elimination phase ensures that it is always 
possible to combine the set of solutions with the next 
bit. 

This algorithm is guaranteed to find all solutions, but 
may exceed time and memory constraints. All steps are 
well defined except for the step to “select a bit to elim¬ 
inate” . The order in which bits are eliminated dramati¬ 
cally affects the time and memory required. Determina¬ 
tion of the optimal order to eliminate bits is known to be 
NP complete. A more detailed description of the algo¬ 
rithm (including examples) will be published elsewhere 
as part of another study in the near future. 


3. Error estimation 


a. Annealers 

For the annealers, the time to solution can be trivially 
found by applying a function T(p ) to the probability of 
success p. In the main text, the T(p) used is the time 
required to find the ground state at least once with a 
probability 0.99: 


Tip) 


log(l - 0.99) 
log(l - p) 


(A3) 


For the purposes of the discussion below this function can 
in fact be totally arbitrary and may depend on system 
size. 

We imagine our annealing algorithms as essentially a 
sequence of binary trials, where each run is either a suc¬ 
cess or a failure. Because we do not have infinitely many 
observations, there is some uncertainty in our estimate of 
the true probability of success of our binary trial. Since 
our data was collected as a series of r independent trials 
(which varies by solver) for each solver and instance, the 
annealers’ results are described by a binomial distribu¬ 
tion. 

In order to determine what the probability of success 
is, we assumed a Bayesian stance — we do not know 
what the probability is currently and so must assume 
some prior distribution for our belief, and will update 
our belief based on the evidence. Which prior should we 
choose? The obvious choice is a /3 distribution as it is 
the conjugate prior for the binomial distribution, giving 
us a closed form for our posterior distribution [61]. We 
have chosen the Jeffreys prior for the binomial distribu¬ 
tion, /3(|, ;j) for two reasons: 1) It is invariant under 
reparameterization of the parameter space, 2) it is the 
prior which maximizes the mutual information between 
the sample and the parameter over all continuous, pos¬ 
itive priors. In other words, it yields the same prior no 
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matter how we parameterize our space and 5 ) max¬ 
imizes the amount of information gained by learning the 
data. The other obvious prior is the uniform distribution 
or /3 (1, 1 ) but it is not invariant under reparameteriza¬ 
tion and learns less from the data than [62], For 

these reasons, /?(|, |) tends to be the standard choice of 
prior for binomial distributions [63] . 

After Bayesian updating by our prior, our probability 
distribution for p is /3(x + |, r — x + |), where x is the 
number of successes and r the number of runs. 

There may be a concern that our solvers do not fully 
conform to a binomial distribution: if there are correla¬ 
tions between successive runs then our empirical success 
probability f would be inflated for hard problems. How¬ 
ever, since we did not observe an advantage for the DW2 
over the classical algorithms for the hardest problems, 
and the only potential advantage we observed is for the 
easier problems at high clause density, we do not believe 
this issue is affecting our conclusions. 

Thus, for all instances X\ £ {X} in a particular range 
of interest (for example, a particular problem size and 
clause density), we have some distribution /3 'j for the 
probability of success for that instance. To generate our 
error bars we used the bootstrap method, which we de¬ 
scribe next. 

If we have M instances in our set of interest, then 
we resample with replacement a “new” set of instances 
{Xi]j , also of length M, from our set X. For each instance 
Xi j from this new set we sample a value for its probability 
from its corresponding distribution fix to get a set of 
probabilities {pij}- We then calculate whatever function 
Fj = f{{pij}) we wish on these probabilities, e.g., the 
median over the set of instances. We repeat this process 
a large number of times (in our case, 1000 ), to have many 
values of our function {Fj}. We then take the mean and 
standard deviation over the set {Fj} to get a value F and 
a standard deviation crp , which form our value and error 
bar for that size and clause density. 

When we take the ratio of two algorithms A and B , for 
each pair of corresponding data points for the two algo¬ 
rithms (i.e., each size and clause density) we characterize 
each point with a normal distributions J\f(pA, &a) and 
N^pbt^b)- We then resample from each distribution a 
large number of times ( 1000 ) to get two sets of values 
for the function we wish to plot F: {Fa} and {Fb}. We 
take the ratio of the corresponding elements in the two 
sets Sj = Fa^/Fb^. and take the mean and standard de¬ 
viation of the set {Si} to get our data point and error 
bar for the ratio. 


b. HFS algorithm 

For the HFS algorithm, time to solution for each prob¬ 
lem is computed as the mean number of trees per sample 
multiplied by 0.625^s x L for a Cl problem. The number 
0.625 is chosen to approximate the times on a standard 
laptop, and the scaling with L is due to the fact that, in 


a parallel setting, the number of steps to reduce a tree is 
linear in L (exactly, it is |L + 2, but the 2 serves only to 
mask asymptotic scaling and is thus not included in our 
TTS or speedup plots or the scaling analysis). 


c. Euclidean distance 

In order to generate the error bars in Fig. 8, instead of 
calculating the Euclidean distance over the total number 
of instances at a given a (700 total), we calculated the 
Euclidean distance over half the number of instances (350 
total). We were then able to perform 100 bootstraps over 
the instances, i.e., we picked 350 instances at random for 
each Euclidean distance calculation. For each bootstrap, 
we calculated the Euclidean distance, and the data points 
in Fig. are the mean of the Euclidean distances over the 
bootstraps, while the error bars are twice the standard 
deviation of the Euclidean distances. 


d. Difference in slope 

In order to generate the error bars in Fig. 10, we used 
the data points and the error bars in Fig. as the mean 
and twice the standard deviation of a Gaussian distribu¬ 
tion. We then took 1000 samples from this distribution 
and calculated their differences. The means of the differ¬ 
ences are the data points in Fig. 10, and the error bars 
are twice the standard deviation of the differences. 


Appendix B: Additional results 

In this section we collect additional results in support 
of the main text. 


1. Degeneracy-hardness correlation 

It is known that a non-degenerate ground state along 
with an exponentially (in problem size) degenerate first 
excited state leads to very hard SAT-type optimization 
problems [64]. Here we focus on the ground state de¬ 
generacy and ask whether it is correlated with hardness. 
We show the ground state degeneracy in Fig. 5. It de¬ 
cays rapidly as a grows, and for sufficiently large a (that 
depends on the problem size) the ground state found is 
unique (up to the global Z 2 symmetry). This suggests 
that degeneracy is not necessarily correlated with hard¬ 
ness, since in the main text we found that hardness peaks 
at a ~ 0.25. To this test directly we restrict ourselves to 
L = 8 and a = 0.4. We then bin the 100 instances at this 
a according to their degeneracy and study their median 
TTS using the HFS algorithm. We show the results in 
Fig. 16, where we see no correlation between degeneracy 
and hardness for a fixed a. We find a similar result when 
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0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 

a 


FIG. 15. Ground state degeneracy. Number of unique 
solutions as a function of clause density for different Chimera 
subgraph sizes Cl- As the clause density is increased, the 
number of unique solutions found decreases to one (up to the 
global bit flip symmetry). Shown is the median degeneracy, 

i.e., we sort the degeneracies of the 100 instances for each 
value of L and a, and find the median. Our procedure counts 
the degenerate solutions and stops when it reaches 10 5 solu¬ 
tions. If the median has 10 5 solutions then we assume that 
not all solutions were found and hence the degeneracy for that 
value of L and a is not plotted. These are solutions on the 
used qubits (e.g., there are many instances for each a at L = 8 
that use < 503 qubits); to account for the n uq unused qubits 
we multiply the degeneracy by 2 nuq . 


we use the success probability of the DW2 as the metric 
for hardness. 


2. Additional easy-hard-easy transition plots 

The universal nature of the scaling behavior can be 
seen in Fig. 17, complementing Fig. 2 with results for 
the 25th and 75th percentiles respectively. The peak in 
the TTS near a = 0.2 is a feature shared by all the solvers 
we considered. 


3. Optimality plots 

The absence of an optimal DW2 annealing time was 
discussed in detail in the main text, along with the op¬ 
timality of the number of sweeps of the classical algo¬ 
rithms. Figure 18 illustrates this: a clear lower envelope 
is formed by the different curves plotted for SQA, SA, 
and SSSV, from which the optimal number of sweeps at 
each size can be easily extracted. Unfortunately no such 
envelope is seen for the DW2 results [Fig. ], leading 
to the conclusion that t a = 20/iS is suboptimal. 

A complementary perspective is given by Fig. 19, 
where we plot the TTS as a function of the number of 
sweeps, for a fixed problem size L = 8. It can be seen 
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FIG. 16. Scatter plot of the TTS for the HFS algorithm and 
the degeneracy at L = 8 and a = 0.4 (100 instances total). 
Even though there is a wide range of degeneracy over several 
orders of magnitude, we do not observe any trend in the TTS. 
The degeneracy accounts for the fact that some qubits are not 
coupled into the problem (e.g., if n qubits are not specified 
for that particular problem, then the degeneracy is 2 n times 
the directly counted degeneracy). The Pearson correlation 
coefficient is —0.046. 
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that the classical algorithms all display a minimum for 
each clause density. The DW2 curves all slope upward, 
suggesting that the minimum lies to the left, i.e., is at¬ 
tained at t a < 20yts. We note that in an attempt to ex¬ 
tract an optimal annealing time we tried to fit the DW2 
curves to various functions inspired by the shape of the 
classical curves, but this proved unsuccessful since the 
DW2 curves essentially differ merely by a factor of t a , as 
discussed in the main text. 

We can take this a step further and use these optimal 
number of sweeps results to demonstrate that the prob¬ 
lems we are considering here really are hard. To this end 
we plot in Fig. 20 the optimal number of sweeps as a 
function of problems size for SAA. We observe that cer¬ 
tainly for the smaller clause densities the optimal number 
of sweeps s op t appears to scale exponentially in L , which 
indicates that the TTS (which is proportional to s op t) 
will also grow exponentially in L. 


4. Additional speedup ratio plots 


To test whether our speedup ratio results depend 
strongly on the percentile of the success probability dis¬ 
tribution, we recreate Fig. 6 for the 25th and 75th per¬ 
centiles in Fig. 21. The results are qualitatively similar, 
with a small improvement in the speedup ratio relative 
to HFS at the higher percentile. 
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FIG. 17. Comparison of the 25th (left) and 75th (right) percentiles of the TTS (log scale) for all algorithms as a function of 
clause density a. The different colors represent the different Chimera sizes tested. All solvers show a peak at the same density 
value of a ~ 0.2. 
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(c) SA 


(d) SSSV 


FIG. 18. Suboptimal annealing time and optimal sweeps for a = 0.2. Plotted is the TTS (log scale) as a function of size 
L for (a) the DW2, with all available annealing times, (b) SQA, (c) SA, and (d) SSSV, for many different sweep numbers. The 
lower envelope gives the scaling curves shown in Fig. for a = 0.2. The TTS curves flatten at high L for the following reason: 
each classical annealer was run Nx times (Asa = Asssv = 10 4 , Asqa = 10 3 ), and our /3 distribution is ,0(0.5, Nx + 0.5) for 0 
successes, which has an average value of ~ 1/(2 Nx)- This reflects the (Bayesian) information acquired after Nx runs with 0 
successes (one would not expect the probability to be 0). The flattening has no impact on the scale of the optimal number of 
sweeps. 


5. Additional correlation plots 


6. Additional scaling analysis plots 


To complement Fig. 7, we provide correlation plots for 
both the DW2 against itself at t a = 20p,s and t, a = 40p,s 
(Fig. 22), the DW2 vs SAA (Fig. 23), the DW2 vs SQAA 
(Fig. 2' ), and SAA vs SQAA (Fig. 25. The DW2 against 
itself displays an excellent correlation at all clause densi¬ 
ties, while the DW2 vs SAA and DW2 vs SQAA con¬ 
tinues to be skewed at low and high clause densities. 
Recall that Fig. reffig:Euclid-dist provides an objective 
Euclidean distance measure that is computed using all 
problem sizes and depends only on the clause density. 


We provide a few additional plots in support of the 
scaling analysis presented in the main text. 

Figure 26(a) shows the number of runs at different 
problem sizes and clause densities, and the correspond¬ 
ing least-squares fits. It can be seen that the straight 
lines fits are quite good. The slopes seen in this figure 
are the 6(a) values for t a = 20fj,s plotted in Fig. 9; the in¬ 
tercepts are plotted in Fig. for all annealing times, 

and collapse nicely, just like the 6(a). 

Finally, Fig. is a check of the convergence of 

SAA to its asymptotic scaling coefficient as the number 
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FIG. 19. TTS (log scale) for L = 8 as a function of number of sweeps for DW2, SQA, SA, and SSSV used to identify the 
optimal number of sweeps. 


of sweeps is increased from 5, 000 to 50,000. Convergence 
is apparent within the 2 cr error bars. 

7. SQAS vs SAS 

In the main text we only considered SQA as an an¬ 
nealer since that is a more faithful representation of the 
DW2. Here we present a comparison of SQA as a solver 
(SQAS), where we keep track of the lowest energy found 
during the entire anneal, with SAS. We present the scal¬ 
ing coefficient 5(a) from Eq. ( ) of these two solvers in 
Fig. . SAS has a smaller scaling coefficient than 

SQAS for the large a values, but at small a values we can¬ 
not make a conclusive determination because of the sub¬ 
stantial overlap of the error bars. We note that Ref. [65] 
reported that discrete-time SQA (the version used here) 
can exhibit a scaling advantage over SA but that this ad¬ 
vantage vanishes in the continuous-time limit. We have 


not explored this possibility here. 

8. Scale factor histograms 


We analyze the effect of increasing clause density and 
problem size on the required precision of the couplings. 
Fig. shows a trend of scaling factor increasing as a 
increases for fixed L , and as L increases for fixed a. In¬ 
creased scaling factor has the effect of relatively ampli¬ 
fying control error and thermal effects in DW2, and can 
therefore contribute to a decline in performance for the 
larger problems studied. However, recall that the region 
where a speedup is possible according to our results is 
in fact that of high clause densities. Thus, whatever the 
effect of precision errors is, it does not appear to heav¬ 
ily impact the DW2’s performance in the context of our 
problems. The same is true for SAS, since as can be 
seen in Fig. 10, the scaling coefficient is unaffected by 
the addition of noise when a > 0 . 6 . 
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FIG. 20. Scaling of the SAA optimal number of sweeps. The optimal number of sweeps is extracted for each L from 
Fig. . The scaling is roughly exponential for the smaller a values, and appears to be close to exponential for the larger a 
values. Lines are guides to the eye. 
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FIG. 21. The speedup ratios (log scale) for the 25th and 75th percentile of time-to-solution as a function of system size for 
various a. The different colors denote a representative sample of clause densities. 
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FIG. 22. Success probability correlations. The results for all instances at all a values are shown for the DW2 data at 
t a = 20/rs and t a = 40/xs. This complements Fig. 7; the color scheme corresponds to different sizes L as in that figure. 
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FIG. 23. Success probability correlations. The results for all instances at all a values are shown for the DW2 data at 
t a = 20/xs and SAA with S = 50,000 and f3f = 5. This complements Fig. 7; the color scheme corresponds to different sizes 
L as in that figure. The correlation gradually improves from poor for the lowest clause densities to strong at a = 0.35, then 
deteriorates again. 
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FIG. 24. Success probability correlations. The results for all instances at all a values are shown for the DW2 data at 
t a = 20jus and SQAA with S = 10,000 and f3 = 5. This complements Fig. 7; the color scheme corresponds to different sizes 
L as in that figure. The correlation gradually improves from poor for the lowest clause densities to strong at a = 0.35, then 
deteriorates again. 
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FIG. 25. Success probability correlations. The results for all instances at all a values are shown for the SQAA data at 
S = 10,000 and j3 = 5 and SAA with S = 50,000 and fif = 5. This complements Fig. 7; the color scheme corresponds to 
different sizes L as in that figure. The correlation gradually improves from poor for the lowest clause densities to strong at 
a = 0.35, then deteriorates again. Note that we did not attempt to optimize the correlations between the two methods. 
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f a =20ps 




FIG. 26. Exponential fits to the DW2 number of runs. In accordance with Eq. (8), the least-squares linear fits to 
log[r(L, a, 0.5)] are plotted in (a) for t a = 20/j.s. To reduce finite-size scaling effects we exclude L = 2,3 and perform the fit 
for L > 4. The intercept of the linear fits is the the DW2 scaling constant a(a) from Eq. (8), and is shown in (b). Error bars 
represent 2 a confidence intervals. 



FIG. 27. (a) The DW2 scaling coefficients b(a) from Eq. (7) for SAA at various sweep numbers and /3f = 5, along with the 

DW2 scaling coefficients for all t a s we tried, (b) The SQAS and SAS scaling coefficient b(a) at the optimal number of sweeps 
for each. SAS had a final temperature of /3/ = 5 and SQAS was operated at (3 = 5. 
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FIG. 28. Histograms of the scale factor of the instances as a function of system size for various values of a. All problems 
passed to DW2 must have all coupler in the range [—1,1], so all couplers in a problem are rescaled down by a factor equal to 
the maximum absolute value of the couplers in the problem (and hence this quantity is called the scale factor). Since internal 
control error (ICE) is largely instance-independent, the larger the scaling factor of an instance, the worse the relative impact 
of ICE will be. We see a drift to larger scaling factors for with increasing size L and increasing clause density a. Larger values 
of a obviously will have larger scale factors as there are on average more loops per qubit (and thus, per edge) and thus a larger 
maximum potential coupler strength. Since the edges included in a loop are generated randomly, the more edges available at 
a fixed clause density, the more opportunities for a single edge to be included in many loops by chance, resulting the average 
scale factor to drift upward as function of problem size. 


